This is in response to the appeal brief filed July 14, 2004. 

(1) Real Party in Interest 

A statement identifying the real party in interest is contained in the brief. 

(2) Related Appeals and Interferences 

A statement identifying the related appeals and interferences which will 
directly affect or be directly affected by or have a bearing on the decision in the 
pending appeal is contained in the brief. 

(3) Status of Claims 

The statement of the status of the claims contained in the brief is correct. 

(4) Status of Amendments After Final 

The appellant's statement of the status of amendments after final rejection 
contained in the brief is correct. 

The amendment after final rejection filed on 5/12/04 has been entered upon 
appeal. 

(5) Summary of Invention 

The summary of invention contained in the brief is correct. 

(6) Issues 

The appellant's statement of the issues in the brief is correct. 

(7) Grouping of Claims 

Appellant's brief includes a statement that claims 3 through 6 stand or fall 
together and provides reasons as set forth in 37 CFR 1.192(c)(7) and (c)(8). 

(8) Claims Appealed 

The copy of the appealed claims contained in the Appendix to the brief is 
correct. 

(9) Prior Art of Record 

5,574,824 Slyh et al. 11-1996 

6,009,396 Nagata 12-1999 

6,061,646 Martino et al. 5-2000 

(10) Grounds of Rejection 

The following ground(s) of rejection are applicable to the appealed claims: 

Claim Rejections - 35 USC § 103 

Claims 3-4 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Martino et al., (6,061,646) in view of Slyh et al., (5,574,824) and in view of 
Nagata (6,009,396). 

As per claim 3, Martino et al., teach an apparatus comprising a self service 
kiosk, which dispenses articles, currency, or communication services (Col.1, lines 
41-60, Fig. 2). 

Martino et al., do not specifically teach, that within the kiosk, a steerable 
beam microphone array having multiple lobes, and means for sampling lobes, and 
distinguishing the difference between speech content and noise content from 
sound signals received from each lobe. Slyh et al., teach a steerable beam 
microphone array (Col. 2, lines 44-45). Slyh et al., also teach distinguishing the 
difference between speech content and noise content from sound signals (Col. 5, 
lines 40-41, Col. 6, lines 32-37, SNR- Signal-to-Noise Ratio). Signal-to-Noise Ratio 
implies that the signal power and the noise power have been separately computed. 
In this case, the signal power is construed to be speech signal power. Therefore it 
would have been obvious to one with ordinary skill in the art at the time of 
invention to incorporate the beam microphone array as taught by Slyh et al., in the 
kiosk for multiple spoken languages of Martino et al., because this would provide 
the user with an improved system using a microphone array to enhance speech 
that has been corrupted by several directional interference signals and/or additive 
background noise. 
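
The computation that a signal-to-noise ratio implies, namely that signal power and noise power are measured separately and then compared, can be sketched as follows. This is a minimal illustration only. The function names and the use of mean squared amplitude as the power measure are assumptions made for the example and are not drawn from Slyh et al.

```python
import math


def mean_power(samples):
    """Average power of a block of samples (mean of squared amplitudes)."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal_samples, noise_samples):
    """Signal-to-noise ratio in decibels.

    The signal power and the noise power are computed separately,
    then expressed as a ratio on a logarithmic scale.
    """
    signal_power = mean_power(signal_samples)
    noise_power = mean_power(noise_samples)
    return 10.0 * math.log10(signal_power / noise_power)
```

A block with amplitude 1 compared against a block with amplitude 0.1 gives a power ratio of about 100, that is, an SNR of about 20 dB.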

Martino et al., in view of Slyh et al., do not specifically teach identifying lobes 
having a relatively low noise content, or relatively high speech content, and 
actuating a lobe having both a relatively high speech content and relatively low 
noise content. Nagata et al., in the same field of endeavor, teach a sound source 
position search unit that estimates a power arriving from each position (Col. 6, lines 
1-2, and 42-47, Col. 9, lines 29-35, Figs., 3 and 6). The sound source position 
search unit is the equivalent of ii) means for sampling lobes, since as described in 
the specification, a lobe is a plot of magnitude versus angular position. Nagata et 
al., teach that all peaks above a threshold are detected as sound sources (Col. 10, 
lines 4-5). This is equivalent to identifying lobes having a relatively low noise 
content, i.e., detecting high speech content. Nagata et al., also teach a speech 
parameter extraction unit that extracts the power for each bandwidth and uses it 
as a speech parameter, which in turn is sent to the speech recognition unit (Col. 10, 
lines 24-27, Fig. 3). In the speech recognition unit, the speech power is calculated 
from the speech parameter (Col. 10, lines 32-33), i.e., identifying a signal with high 
speech content, and low noise content. 
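
The threshold test relied upon above, in which every peak of the power distribution that exceeds a threshold is treated as a sound source, can be sketched as follows. This is a minimal illustration under stated assumptions: the power distribution is assumed to be a list with one estimated power value per direction, and the function name and the handling of the end points are hypothetical choices for the example, not the implementation of Nagata.

```python
def detect_sources(power_by_direction, threshold):
    """Return the indices of local peaks whose power exceeds the threshold.

    power_by_direction holds one estimated power value per direction.
    End points are compared against their single neighbor only.
    """
    peaks = []
    n = len(power_by_direction)
    for i, p in enumerate(power_by_direction):
        left = power_by_direction[i - 1] if i > 0 else float("-inf")
        right = power_by_direction[i + 1] if i < n - 1 else float("-inf")
        # A source is a local maximum that also clears the threshold.
        if p > threshold and p > left and p >= right:
            peaks.append(i)
    return peaks
```

With the distribution [0.1, 0.9, 0.2, 0.05, 0.6, 0.3] and a threshold of 0.5, the directions at indices 1 and 4 are detected; raising the threshold to 0.7 leaves only index 1.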

Therefore, it would have been obvious to one with ordinary skill in the art at the 
time of invention to incorporate detecting speech content in a signal as taught by 
Nagata et al., in the apparatus of Martino et al., in view of Slyh et al., because, an 
artisan would readily recognize that it would effectively place the beam of the 
microphone for higher degree of speech recognition, and would effectively 
suppress background noise and localize sound sources within a kiosk with a 
steerable microphone array having multiple lobes. 

As per claim 4, Martino et al., in view of Slyh et al., and in view of Nagata et 
al., teach the apparatus as per claim 3, further comprising speech recognition 
means for recognizing speech contained in the lobe actuated. Nagata et al., teach 
that the obtained band-pass power of the sound source is sent from the speech parameter 
extraction unit to the speech recognition unit and used in the speech recognition 
processing (Col. 10, lines 24-31, Fig. 7). Therefore, it would have been obvious to 
one with ordinary skill in the art at the time of invention to include the means for 
recognizing speech contained in the lobe of a steerable microphone array, because 
one with ordinary skill in the art would recognize that this would process only that 
part of signal that has high speech content and low noise content for more precise 
speech recognition capability. 

Claims 5 and 6 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Martino et al., (6,061,646) in view of Nagata et al., (6,009,396). 

As per claim 5, Martino et al., disclose a method comprising the following 
step: maintaining a self-service kiosk which dispenses articles, currency, or 
communication services (Col.1, lines 41-60, Fig. 2). However, Martino et al., fail to 
teach: maintaining a beam-steerable microphone array at the self-service kiosk, 
measuring noise content and speech content of several lobes of the array, and, 
selecting the lobe which carries larger speech signals than other lobes, and smaller 
noise signals than other lobes. Nagata et al., teach a speech recognition system 
using a microphone array (Col.1, lines 37-44, Fig.1). Nagata et al., also teach that 
all the peaks on the sound source distribution above a predetermined threshold are 
detected as sound sources (Col. 10, lines 4-5). One with ordinary skill in the art 
would be able to measure noise content of the several lobes of the array, since 
Nagata et al., already distinguish noise from sound in the signal coming from the 
microphone array, i.e., the threshold can be considered as noise floor, when the 
signal is above this threshold, then it is considered to be speech and the 
parameters extracted, and below is noise. Also, Nagata et al., teach a speech 
parameter extraction unit that extracts the power for each bandwidth and uses it 
as a speech parameter. This speech parameter is then sent to the speech 
recognition unit (Col. 10, lines 24-27, Fig. 3), in which speech signal power is 
calculated from the extracted speech parameter (Col. 10, lines 32-33). Therefore, it 
would have been obvious to one with ordinary skill in the art at the time of 
invention to modify the method of Martino et al., to further comprise maintaining a 
beam-steerable microphone array at the self-service kiosk, with the method taught 
by Nagata et al., of measuring noise content and speech content of several lobes of 
the array, and selecting a lobe which carries larger speech signals than other lobes 
and smaller noise signals than other lobes, because one of ordinary skill in the art 
at the time of invention would readily recognize that this would provide more 
accurate speech recognition by suppressing background noise and localizing sound 
sources effectively. Also, it would have been obvious to one with ordinary skill in 
the art to realize that, in the combination of Martino et al., in view of Nagata et al., 
the resultant filter configuration with a plurality of delay line taps is used so 
that the power in each direction or position is obtained for each frequency 
bandwidth necessary for the speech recognition, rather than obtaining the power in 
each direction or position for each frequency that was used conventionally, 
because the obtained power can be directly used for the speech recognition while 
the required amount of calculations used is greatly reduced (Nagata et al., Col. 14, 
lines 60-67). 
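
The selecting step recited in claim 5 can be sketched as follows. This is a minimal illustration under stated assumptions: each lobe is assumed to have already been reduced to a pair of measured values (speech power, noise power), and the function name and the choice to return no lobe when none satisfies both conditions are hypothetical and are not taken from the references or from the appellant's specification.

```python
def select_lobe(lobes):
    """Pick the lobe with larger speech and smaller noise than every other lobe.

    lobes is a list of (speech_power, noise_power) pairs, one per lobe.
    Returns the index of the qualifying lobe, or None if no lobe
    satisfies both conditions at once.
    """
    for i, (speech, noise) in enumerate(lobes):
        others = [pair for j, pair in enumerate(lobes) if j != i]
        if all(speech > s and noise < n for s, n in others):
            return i
    return None
```

For the measurements [(0.2, 0.5), (0.9, 0.1), (0.4, 0.3)], the second lobe (index 1) is selected, since it carries both the largest speech power and the smallest noise power.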

As per claim 6, Martino et al., in view of Nagata et al., teach the method as 
per claim 5. Nagata et al., teach receiving signals from the lobe selected, and 
performing speech recognition on the data. Nagata et al., teach a speech 
recognition unit whereby speech power is calculated from the speech parameters 
extracted by the speech parameter extraction unit, and a speech section detected 
by a speech section detection unit according to the speech power. Then a pattern 
matching unit carries out pattern matching with a recognition dictionary so that 
speech recognition is realized (Col. 10, lines 32-39). Therefore, it would have been 
obvious to one with ordinary skill in the art at the time of invention, to modify the 
method of Martino et al., to further comprise the step of receiving signals from the 
lobe selected, and performing speech recognition on the data, because one of 
ordinary skill in the art would readily recognize that this would allow speech 
recognition on a selected part of where the signal is most likely carried as opposed 
to noise. 

(11) Response to Argument 

In response to applicant's argument that the examiner's conclusion of 
obviousness is based upon improper hindsight reasoning, it must be recognized that 
any judgment on obviousness is in a sense necessarily a reconstruction based upon 
hindsight reasoning. But so long as it takes into account only knowledge which 
was within the level of ordinary skill at the time the claimed invention was made, 
and does not include knowledge gleaned only from the applicant's disclosure, such 
a reconstruction is proper. See In re McLaughlin, 443 F.2d 1392, 170 USPQ 209 
(CCPA 1971). 

In response to applicant's argument that there is no suggestion to combine 
the references, the examiner recognizes that obviousness can only be established 
by combining or modifying the teachings of the prior art to produce the claimed 
invention where there is some teaching, suggestion, or motivation to do so found 
either in the references themselves or in the knowledge generally available to one 
of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. 
Cir. 1988), and In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992). In 
this case, it would have been obvious to one with ordinary skill in the art at the 
time of invention to incorporate detecting speech content in a signal as taught by 
Nagata et al., in the apparatus of Martino et al., in view of Slyh et al., because, an 
artisan would readily recognize that it would effectively place the microphone for 
higher degree of speech recognition, and would effectively suppress background 
noise and localize directional sound sources within a kiosk with a steerable 
microphone array having multiple lobes. 

Application/Control Number: 09/731,084 

Art Unit: 2654 



For the above reasons, it is believed that the rejections should be sustained. 